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ABSTRACT 

Regular endurance exercise training induces beneficial functional and health effects in human 
skeletal muscle. The putative contribution to the training response of the epigenome as a 
mediator between genes and environment has not been clarified. Here we investigated the 
contribution of DNA methylation and associated transcriptomic changes in a weUs^cqntr^Jl^d 
human intervention study. Training effects were mirrored by significant al1|pr^|cffl|^f DNA 
methylation and gene expression in regions with a homogeneous r|^^l^ ^ergetics and 
remodeling ontology. Moreover, a signature of DNA methylation anj^g^expression separated 
the samples based on training and gender. Differential DNA>^ethi^tion was predominantly 
observed in enhancers, gene bodies and intergenic region^^jd 1^ in CpG islands or promoters. 
We identified transcriptional regulator binding ^||^iftf ||^||^RF, MEF2 and ETS proteins in the 
proximity of the changing sites. A transcrip^ion^^^twork analysis revealed modules harboring 
distinct ontologies and, interestingly, ^^pveralTOirection of the changes of methylation within 
each module was inversely correlated to^pression changes. In conclusion, we show that highly 
consistent and associated moll^il^^^ in methylation and expression, concordant with observed 
health-enhancing phenotypic adaptations, are induced by a physiological stimulus. 
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INTRODUCTION 

Endurance exercise training is a strong physiological stimulus that leads to a multitude of health 
and functional improvements when performed regularly. The health benefits include piKXention 
and/or treatment of a multitude of the most common diseases, e.g., cardiovascular ^ea^,1||)e 




process.^ Epigenetic modifications through DNA methylation regu^^Ktha gene transcription.^ 



II diabetes, and several forms of cancer.' The health benefits following exel^l^wm^amg are 
elicited by gene expression changes in skeletal muscle, which are fundam^al^ 1fie remodeling 

on re^^^^^ 

DNA methylation has traditionally been considered to be a nm^c^w stable modification that 
could change only over long periods of time, e.g., in disease development and during the 
general aging process. ' However, there igRijCTa|^iipg evidence that more short-term 
environmental factors can influence DNA n^^^j^ion. For example, dietary factors have the 



potency to alter the degree of DN^^^^gthylation in different tissues, ^''^ including skeletal 
muscle, while exercise is les^weU studied. In one study, a single bout of endurance-type 
exercise was shown to aff^^ipWli^tion at a few promoter CpG sites. In the context of 
diabetes, exercise trainin§^asi)een shown to affect genome-wide methylation pattern in skeletal 
muscle,'^ as w^Jlj^i^frajplpose tissue.'"* Together, those data indicate that physiological stressors 



can indeedgaifea^^^A methylation. 
D^^^k 



AoB^llj^^^^keletal muscle depends on successive transient increases in mRNA encoding 
reguMpry, metabolic, and structural proteins.'^ However, the acute changes in gene expression 
are quite different fi-om the more robust basal alterations that characterize a well-adapted muscle 
after a major lifestyle change, e.g., months of regular exercise. 

In order to obtain a deeper understanding of the mechanisms underlying the massive fimctional 
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and health benefits of regular exercise, we conducted a 3-month fully supervised human one- 
legged exercise training study. Training only one leg allowed for an intraindividual control, 
thereby excluding potential influence of diet, seasonal changes or unknown environmental 
factors, which are expected to affect both legs equally. We observed that the training intervention 



reshapes the epigenome and induces significant changes in DNA methylation, and the global 
fractions of decreased and increased methylation sites were similar. Importaa1ltV(Ml^es in 



DNA methylation were enriched in regulatory enhancer regions. Functional categ^ies related to 



SJ 

*fraPiui 



muscle biology (e.g., regulation of cellular carbohydrate metabolisir|[^^^|^lmiral remodeling) 
were overrepresented among differentially methylated sites. In^addition, a coordinated 
transcriptional and epigenetic response was identified throwh n^^ork analysis. 
Together, the findings from this tightly cont^^e4 j^^^m study strongly suggest that the 
regulation and maintenance of exercise t4fci||elda|l^ation is to a large degree associated to 
epigenetic changes, especially in regula^gfy enhOTcer regions. 
RESULTS 

Endurance exercise trainiiKn9%|0if global alterations in DNA methylation 

Twenty-three young volufiteeiB (Table 1), not regularly performing intense exercise, performed 
supervised onejggf^^^^e^tension exercise training for three months (45 min, 4 sessions per 
week; FiaiH^r training only one randomized leg (trained leg), while the second leg 

(u^pteed w^f was used as an intraindividual control leg. Skeletal muscle biopsies from the 
vastieLlateralis were taken from both legs at rest, before and after the training period (see 
Supplementary Methods S9 for details). Performance improvements and emjme activity 
increases in the trained leg confirmed that the training response was highly significant (p<10^. 
Figure lA-B and Figure SIB-C). 



4 



To address the effect of training on DNA methylation of specific sites across the human skeletal 
muscle genome we used the lUumina Infinium HumanMethylation450 arrays. Endurance 
training [after training (Tl) vs. before training (T2)] induced significant false discovery rate 
(FDR< 0.05) methylation changes at 4919 sites across the genome in the trained leg (Figure IC, 
Dataset S13). Of these, 839 sites had an absolute change of at least 5% in their mean methylation 
level (□ -value) after training, with a maximum of 9%. The correspondii^^|j^Sl||^tional 



analysis was performed using RNA sequencing, which identified 4076 c^ferent^IIy expressed 
genes (DEGs; Figure ID, Dataset S13). Ontology and pathway analy^j^^i^W^ showed that the 
transcriptional response robustly reflected key pathways invol^d it^raining adaptation and a 
trained muscle phenotype (Figure S2-S3 and Table S10).^lustwng analysis identified training 



and gender as the major determinants of varial^jjj^y ^^^psomal DNA methylation and gene 
expression data (Figure IE). The clust^fe^% ^dependent of gene expression on sex 
chromosomes, as a corresponding a^uping il^as obtained excluding genes on X and Y 
chromosomes (data not shown). Mj^oyeJ'a complementary approach revealed that over 600 
CpG sites correlated to the ^l^ell^jj^citrate synthase activity, an objective measure of training 
response (Figure S4 and^ata^t S14). This might imply that some of these sites could influence 
the degree of traini^^^^raS 
Global cytogin^nMiylation was also analyzed to look for globally unidirectional changes by 
lui^i^mel^^pKthylation assay (LUMA),'^ revealing no change in overall methylation levels 



with fining, and no difference between the trained and the untrained leg (Figure S5A). Global 
CpG methylation was remarkably similar both within the same subject and between all subjects. 
This is consistent with the specific observations from the array data of approximately equal 
number of sites increasing and decreasing in methylation and no significant change in the sample 
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average methylation after training. Analysis of global hydroxymethylation levels resulted in no 
overall change with training (Figure S5B). 

As expected by a physiological environmental trigger on adult tissue, the observed effect size on 
DNA methylation was small in comparison to disease states such as cancer.^ Hence, tt^exclude 
possible technical artifacts leading to false positive associations, we selected several genomic 
positions for bisulfite pyrosequencing validation. Overall, the methylation le^j^^^||iBcantly 



correlated with the 450K array data (Fig. S5C). Out of seven selected sites, w^observed the 



sites, w 



same direction of change for six positions (up/down/no change), thi|[^jgp^H^ support on the 
reliability of our data. The untrained leg was also included uiriie validation to verify that the 
observed changes were indeed an effect of the training its^ano^ot due to other environmental 
effects. No significant change was detected in l^^^^t^^4 leg. For details, see Figure S5D-J 
and the corresponding text. Moreover a Q-^Wi|^k tB^ array results is showed in Figure S8. 
Enrichment of training-induced DMj^j^^ ennxicers 

Annotation of the differentially methylated positions (DMPs) revealed a preferential localization 
outside of CpG Island/SheK^S^p^. When considering the positioning of the DMPs with 
respect to the genes, we could also detect an enrichment of the relative fraction of DMPs in gene 
bodies and interggl|i|^ejlon^nd fewer DMPs in promoter regions (TSS200), as compared to 
the relativp_di^^i1^tion of the probes present in the array (Figure 2A-B). Given these 
ob|Jl|^yonl^y<^ asked whether the major source of the DMPs could originate from enhancers. 
Indee^ using the standard array annotation, DMPs were significantly enriched in enhancers 
(Figure 2C). This finding was fijrther corroborated using external sources of data and annotations 
(see Supplementary Methods S9). For example, when defining enhancers based on histone marks 
(either H3K4mel, H3K27ac, or H3K4mel and H3K27ac jointly) and using completely 
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independent data from human skeletal muscle tissue, we also confirmed that the changing sites 
were observed predominantly in enhancers (Figure 2D). Additional evidence came from the 
definition of chromatin states by a chromatin segmentation algorithm and the corresponding 
public data available for human myoblast cultures . Here we observed an enrichmen^ relative 
to the array background, for strong and weak enhancers, but not for promoters. We also observed 
a slight preferential enrichment in regions of accessible chromatin, as de^^^^S^^^ase 
Hypersensitive Sites (Figure S6). These observations have potential implications^n the role of 
DNA methylation on regulating gene expression, a phenomenon lika^^Mflrafed by altering the 

1 8 20 ^^^^ ^^^^^ 

methylation status of enhancers and other regulatory elements. ^ 

The finding that endurance training especially influences^jphail^rs is indubitably novel in the 
context of tissue adaptation to a physio logical^ginw^^*' humans. Therefore, we sought to 
identify the biological processes correspon()(fc^)^i^-egulatory regions. 
Differential methylation is related t^^j^es g^fcming muscle related processes 

To assess the putative physiological ^evance of the DMPs, we performed a functional 
annotation and ontology analysis, where we defined an association rule that 

considered gene regulatory domains (see Supplementary Methods S9). Indeed, we detected a 
clear enrichment^o^^u^le ontology related processes for the genes in the vicinity of DMPs, 
demonstr^Qg Ij^^he top enriched categories of molecular function, biological process and 
'^^^'Nte^P^ii^^^t were linked to myogenesis as well as muscle structure, function and 
bioen^getics, in concordance with a trained muscle phenotype (Figure 3). This prompted to a 
deeper characterization of DMPs, by extending bidirectionally the sequence of the position (± 
100 bp) and examining the enrichment of known transcriptional motifs. Concordantly, we 
observed a significant enrichment for binding motifs of the MRF (Myogenic Regulatory 
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Factors), and MEF2 (Myocj'te Enhancer Factor 2) classes in DMPs increasing in methylation 
after training (Figure 3 and Table Sll). Conversely, for DMPs that decreased their methylation 
level, an enrichment for the ETS family binding domain was observed (Figure 3 and Table Sll). 
The ETS transcription factors have been strongly implicated for example in angiogenesig,^^ a key 
process in endurance exercise adaptation. 

Specific differential methylation correlates with expression changes uf^ad^M^rcise- 



is,^^ a ke-' 



dependent manner ^ 
The finding that differential DNA methylation is observed at enhanj^^ji^^regulatory regions 
with the associated enrichment of muscle ontology processes s<^^est||^ fimctional link between 
DNA methylation and RNA expression. To integrate thos^la^o OTka types, we initially performed 
correlation analysis, in order to highlight pai^|^f^^^^)^ethylation positions that showed 
dependence after training. Overall, when all^^^|;^g^sition pairs were considered, the Spearman 
correlation was centered on zero 4A^i-evealing no sign of global trend. However, 

selection of the pairs resulting from a DMP and a DEG revealed that an effect of exercise 
training exists in specific glfcn^^jlgions, indicated by peaks of both positive and negative 
correlation (Figure 4A) ^at were preferentially observed according to the relation to gene 
regions.^ In i^i l^iBlaP negative correlation was more prominent for probes in 
promoter/^HJT^^^xon regions, while gene bodies had a stronger peak of positive correlation 

In pr^ciple, since we are analyzing longitudinal observations on the same subject before and 
after training, any detected pattern of correlation between expression and methylation could arise 
from the individual levels of methylation and gene expression being correlated at the subject 
level and not as a result of the experimental intervention. Indeed, a Q-Q plot (Figure 4C) 
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revealed that, after excluding DMP/DEG pairs, some residual correlation remained and this 
behavior is most likely explained by the presence of some baseline level of correlation between 
DNA methylation and gene expression, that is not explained by training, but instead by inter- 
individual factors (e.g., gender). However, when considering only the DMP/DEG pairs, we 
reduced the number of false positives or, in other words, we decreased the possibiUtv lha^j^e 
dependence is not explained by the endurance training. This phenomenon is fiij|^l|^|a^jp^lified 
by a starburst plot that illustrates the relationship between DNA methvlation ^d expression 
changes (Figure 4D). This integrative analysis identified 255 |l^p|^lfrated genes with 
significant increase in methylation, 203 upregulated genes witlk significant increase in 
methylation, 273 upregulated genes with significanl^|^eci^e in methylation, and 70 
downregulated genes with significant decrea^^uv j^^^lation. Those gene/position pairs 
correspond to interesting examples of chai^fc^hlbeafcy the training-specific intervention and, 
therefore, provide the fraction of gen g|^h ose^l^)ression-methylation correlation is changed in 
different directions. Individual e^affiS l^ ^ ected among the top-correlated genes are given in 
Figure 4D. We found sevf|^ ^msjj^ting genes in the context of muscle physiology and 

lu» the 



metabolism. Examples wcluofc the MIPEP gene that encodes for a mitochondrial peptidase. 



Ih^ nS^a 



primarily involvedlm^^ i3|^ation of proteins involved in oxidative phosphorylation,^^ and 
GRK5, a GP^R^Jiase involved in multiple biological processes, proposed as a positive 
regji^^r^^^^sulin sensitivity in mouse.^^ In turn, some of the highly correlated 

gene/^ethylation probes were not differentially regulated after training, including many HLA 
genes and other genes whose expression could possibly be influenced by individual genotypes 
and not by the experimental intervention. This is exemplified by THNSL2 (in the bottom left 
panel of Figure 4D), whose bimodal expression has been linked to a cis-eQTL in muscle tissue. 
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For more examples, see Table S12. 

Integrating transcriptional network analysis with DNA methylation identifies a 
coordinated training response 

Having established that DNA methylation alterations are induced by training and are Jinked to 
corresponding genes that are also regulated by exercise training, we asked whethep -vj^ c<^d 
visualize epigenomic changes superimposed on a reconstructed transcriptional|[^a^^OT^g[^[ence, 
we reverse-engineered a mutual information-based transcriptional network (p<]^^) using count 
data from DEGs alone and, in a second independent step, we overlaicjj^Si^N^nethylation data 
by summarizing the methylation changes of the linked DMPs (Iwure |^ This analysis uncovered 
three major network domains and some smaller modilfcs (Figures S7) that were grouped 
according to their relative fold change (T2 v^[^l^tilatl^)rresponded to distinct ontologies. 
Domain A was overall downregulated (a^^^i^^e2FC=-0.31) and was populated with genes 
annotated for regulation of gene expr^|^, D^K replication and cell cycle. Examples include 
MDM2, a E3 ubiquitin-protein ligase that mediates ubiquitination of p53, ZNF638, a 




transcription factor associat3i^o^|g(R?G expression, UACA involved in regulation of stress- 
induced apoptosis andCL/^, part of an E3 ubiquitin-protein ligase complex and previously 




shown to decr e^e ^ijr' ep)reWon with endurance training. At the other extreme, domain C 
contained ^uprq^^l^ed genes (average log2FC=0.60) involved in morphological changes, 
indlpH^jg^^U^dhesion, blood vessel development and extracellular matrix organization. This 
doma^ included genes that have been shown to increase in expression with training, such as 
several coUagens (e.g., C0L4A1 and COL4A2), the protease inhibitor A2M, the adhesive 
glycoprotein gene THBS4, and PDGFRB involved in blood vessel formation. Domain B 
contained a majority of upregulated genes, with an intermediate magnitude as compared to 
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domain C (average log2FC=0.25). This domain was clearly associated with cellular energetics, 
mainly oxidative phosphorylation. Many mitochondrial enzjmies were found in this group, 
including CS, SDHA, HADHA, COX7A2, ATP5B and several NADH dehydrogenases. 
Importantly, the independent integration of DNA methylation information on toj^ of the 
transcriptional network identified a consistent pattern of inverse changes, possibly reflecti^ a 
coordinate transcriptional plan (Figure 5 and Dataset SI 3). In fact, the upregulatj^!(^^^|^^ was 
associated with a significant decrease in DNA methylation, whereas the downregulated domain 
A had almost entirely corresponding positive changes in DNA jj^pjplra. Concordantly, 
domain B contained a comparable fraction of genes with cojrespoiiding positive or negative 
changes in DNA methylation. Interesting genes whose i|jgth^Ujon changes were opposite to 
those of transcription included the coUagens CQ^jU^ndjCOL4A2, and laminin LAMA4 that 
form the basement membrane around sM^|gl^\ii^e cells. For oxidative metabolism, the 
enzjmies MDHl and a NADH dehydratase of^e electron transport chain NDUFA8 represent 
interesting examples. In domai^A^ybe Jnyosin phosphatase PPP1R12A and TRDN, likely 
involved in regulation of qp^il^^ase from the sarcoplasmic reticulum, both increase in 



methylation and decrea3|mn expression. 
DISCUSSION (^jC 

From this w^l-^j^n^oUed, prospective and extensive human study we were able to create a map 
°^iJ^''^itE™^^*^^^^t^^t and biologically relevant DNA methylation changes in skeletal muscle 
tissue^in response to a hfestyle intervention known to improve function and health. The 
significant changes in DNA methylation, that primarily occurred in enhancer regions, were to a 
large extent associated with relevant changes in gene expression, even if no causal relationship 
could be definitely determined. 
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The main findings of this study were that three months of endurance training in healthy human 
volunteers induced significant methylation changes at almost 5000 sites across the genome and 
significant differential expression of approximately 4000 genes. The genes associated with the 
DMPs that increased and decreased in methylation, respectively, with training, represent distinct 
ontologies. DMPs that increased in methylation were mainly associated to structural remodeling 
of the muscle and glucose metabolism, while the DMPs with decreased were 
associated to inflammatory/immuno logical processes and transcriptional reflation. This 
suggests that the changes in methylation seen with training were no^j^^^Ni^ffect across the 
genome but rather a controlled process that likely contributes^P skeletal muscle adaptation to 
endurance training. ^ 

CpG methylation is subjected to spatial (tis^^ pr-ceU^specific methylation) or temporal 
variability (age-dependent, disease-assl^5l|^\ environmental-mediated differential 
methylation). In order to eliminate co^Dunding«itfactors, we formulated a paired study design, 
where two samples from the saio£^ |^i\j mial were contrasted before and after the training 
intervention. We verified, fqp^^^i^^sites, that the observed effect, despite small, was specific 
for the trained leg, corrworamg the hypothesis that the measured changes in DNA methylation 
represent explicit »flgc#la^panifestations of exercise training. Unlike certain diseases, e.g. 
cancer, envjr<!|uneilally induced changes in DNA methylation are typically small, and 
sui*l^|tibll^^arge individual variation. ^'^^ By employing an intraindividual control leg for 



valid^on of methylation and physiological changes, we minimize any influence of diet or other 
environmental factors that may otherwise be a challenge in environmental epigenomics.^^'^° Here 
we successfully report the identification of significant and ontologically coherent changes in 
DNA methylation in a human genome-wide study of the effect of endurance exercise in muscle 
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tissue, which was made possible by the combination of a well-controlled, prospective study 
design with a comprehensive, integrative bioinformatic analysis. 

A well-known source of heterogeneity in DNA methylation studies is represented by the cell 
composition of the analyzed sample. The epigenome varies between cell types and corr^ting for 
cellular composition has been shown to reduce confounding due to cell type bias in blood^^' 



♦ <1 



Nevertheless, for complex solid tissue samples, the correction for cell cojjjf^ii||g/is not 

straightforward, as it would require the isolations and profiling of individual ^11 types. An 



increased vascularization following endurance exercise results ^^^^(f^^j/^^fST endothelial cell 



iviauai 
mjISr e 



content per skeletal muscle fiber. In line with this, we confirrned that some exclusive and non- 
exclusive endothelial markers were differentially expre$|gd a|pr training (data not shown). 
Hence, we cannot completely rule out the possibilit;^that, a minor fraction of the observed 
changes in DNA methylation and/or gene d^^l|j|gi\}^uld reflect changes in cellular fractions. 
Correlation of the changes in DNA m^hvlation^ the changes in gene expression showed that 
the majority of significant meth;^^JgJjo^x^ssion pairs were found in the groups representing 
either increases in expressi^^^f^^^oncomitant decrease in methylation or vice versa. The 
fraction of genes showii|f bom significant decrease in methylation and upregulation was 7.5% of 
the DEGs or 2.3°Ai^s§^ ^es detected in muscle tissue with at least one measured DNA 

Correspondingly, 7.0% of the DEGs or 2.1% of all genes showed both 
sig^i^aCTiliipfase in methylation and downregulation. This could reflect the classical view on 
prorn^er methylation with a reciprocal relationship between methylation and expression. The 
pairs showing a concordant change were skewed in the positive direction. In fact, 5.6% of the 
DEGs or 1.7% of all genes were significantly changing in the positive direction, whereas only 
1.9% of the DEGs or 0.6% of all had significantly negative changes. In total, we show that DNA 
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methylation changes are associated to gene expression changes in roughly 20% of unique genes 
that significantly changed with training. 

One previous study has investigated genome-wide alterations in DNA methylation after long- 
term training using MeDIP-ChIP and related those to gene expression changes by micr^rrays. 
Various methods are now available for determining CpG methylation status, eachpotentially 
exhibiting advantages and limitations and differing in the ability to d|J;^^^S|i£iwential 



methylation. ' " . We employed a method (450k arrays) that quantifies methy^tion levels at 



intiiies metny 
l|^r^^Wf s 



specific loci and does not require correction for CpG bias, coupl|P^^|||KWfv-seq, in order to 
uncover the connection between the epigenetic and transcri^iona^esponses, and therefore 
obtain a deeper understanding of the impact of regular ex^ise^lPie methodological differences 



render a direct comparison more difficult, howo^,^^h studies show reduced methylation of 
some genes involved in muscle physio log5(|jS|^t^\i^We RUNXl and C0L4A1. In contrast, for 
the gene THADA also highlighted in Nkert et. ai^^^ we observe an increased methylation of two 
CpG sites. Our sites are, howeverjoolfced^the gene body, while changes in Nitert et. al occur 
in promoter regions, sometn^^tlli^jj^ly produces different effects on transcription. 
The network reconstnmfcn^sinted in domains classified mainly as structural, metabolic and 
regulatory and swmmKdk ^coordinated pattern of change in DNA methylation and gene 
expression^^^ ^^plj B of structural genes include C0L4A1, COL4A2 and LAMA4. These genes 
ha^J^^o^^^j^dentified as important for differences in responsiveness to endurance training,^^ 
wher^methylation status could be part of the mechanism behind variable training response. 
Among the metabolic genes, MDHl catalyzes the reversible oxidation of malate to oxaloacetate, 
utilizing the NAD/NADH cofactor system in the citric acid cycle and NDUFA8 plays an 
important role in transferring electrons fi-om NADH to the respiratory chain. Regulatory genes 
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downregulated with training include MDM2, targeting p53 for proteasomal degradation, and 
PPP1R12A, a subunit of myosin phosphatase that is also capable of inhibiting HIFIAN- 
dependent suppression of HIFIA activity. The resulting inhibition of HIFIA could be an 
advantage for endurance adaptation as HIFIA has a negative effect on oxidative metaboUsm. ^^'^^ 
Recently it was shown that enhancers have dynamic methylation and that their methylation 
levels are more closely associated with gene expression alterations than promo^pj^^^j^^ion in 



cancer/ In the present study, methylation predominantly changed in^nhancqj^ regions with 



£nnanc( 



enrichment for binding motifs for different transcription factors|j|^p|^I^ that enhancer 
methylation may be highly relevant also in exercise biology. Regiojas with sites increasing in 
methylation were enriched for myogenic regulatory fadlj^s ol^j^lRFs and myocyte enhancer 
factors, or MEFs. The MRFs include Myf5, My^^^ ifty^a^jn, and Mrf4. These are basic helix- 
loop-helix transcription factors of the m^^^imi^li%age that control the determination and 
differentiation of skeletal muscle cell 9%p the dUibryo Of special interest in the biology of 
endurance training may be that Ml^iiSj^llrough binding to the PGC-la core promoter, can 



regulate this well-studied cqp^^^^jjf mitochondrial biogenesis."^' Animal studies indicate that 
MyoD promotes slow-*jl^^|fiber transformation while myogenin could possibly work as an 
ultimate transcript^y ^^fay or^inding directly to promoters of mitochondrial enzjmies, thereby 
promotin g ^ oj^jpidative profile of muscle tissue. 

Th^lll^jo^i^^phancer factor 2 (MEF2) transcription factor links, partly through regulation of 
class ^la histone deacetylaces (HDACs), extracellular signals to the activation of genetic 
programs involved in cell differentiation, proliferation, morphogenesis, survival and apoptosis in 
many different cell types."^^ KO experiments suggest that some of the MEF2 isoforms promote 
slow fiber phenotype during development."*^ 
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That endurance training led to an increased methylation in enhancer regions containing motifs 
for the MRFs and MEFs is somewhat counterintuitive since it should lead to the repression of the 
action of the above discussed transcription factors. Possible explanations include either: a) a 
dynamic regulation of enhancer accessibility and activity, involving active methylation as an 
inactivation mechanism at the time (i.e. three months) when morphological changes have already 
taken place and a negative feedback should be provided; b) a confounding effec^ifcerafcMiweased 



representation of different cell types after training, which would result Jn the ^election of a 



'i 



change in methylation because of the under Ijdng differences in meth|^^p|^B^een muscle and 
non muscle cell types and c) possible incompletely undej^ood^henomena such as the 
differential methylation that mhibits transcription of an^|jhiD!|pry RNA molecule, as it was 
shown for pi 5."*^ However, these factors regulal^^ lar^^jjnber of genes and there are several 
that decrease with training in this stud5»|clilii^ CDCH15, MYH3, TNNT2, RYRl and 




SH3GLB1, which would be expect ^t^fr om m increased enhancer methylation. Also, the 
expression of MEF2A itself decrea|eAtityffaining. 

Regions with sites decreasi^^n^Ktjjiylation with endurance training were enriched for the ETS 
family of transcription wtor^^ large group with a well-conserved binding motif A reduction in 
methylation with ti^^^^mS^ancer regions containing this motif potentially allows for further 
activation Q^th^^^sponse genes. There are several members of this group that are known to 
reg^j^^^^S^^jjife-responsive genes, thus proposing an epigenetic regulation with endurance 
trainij^. GABPA, also known as nuclear respiratory factor 2, controls expression of several 

volved in mitochondrial respi 
the transcriptional coactivator FGCla."^^ Another interesting example is ETSl, that targets genes 



genes involved in mitochondrial respiration in human skeletal muscle ^'^ and is itself regulated by 



important for endothelial migration e.g. matrix metalloproteinases, angiopoietin 2 and the 
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vascular endothelial growth factor receptor 2 (KDR),^^ which increase in expression in this study. 
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ELFl targets TIEl and 'nE2, also important factors in angiogenesis known to increase with 
endurance training in human skeletal muscle."*^ The ETS family of transcription factors have 



been shown to interact with RUNXl which has been identified as one potential key 
transcription factor in the regulation of the endurance training induced transcriptome.^^.^^e 
mechanisms responsible for inducing these training specific effects are yet ^l^^^^inhed. 
However, altered metabolism is known to induce epigenetic changes and metab^ites from the 
citric acid cycle such as a-ketoglutarate are used as substrates for se^^^i^He enzymes known 
to catalyze methylation reactions."^^ ^ 
In conclusion, this study demonstrates that the transcrip1^|jial^|ferations in skeletal muscle in 
response to a long-term endurance exercise mter^^^ioa^re coupled to DNA methylation 
changes. We suggest that the training-indl^5j^i^c\)rabated epigenetic reprogramming mainly 
targets enhancer regions, thus contri l3l|^in g to ^fferences in individual response to hfestyle 
interventions. We provide a valuaJalg Vad J ovel perspective on the fields of human physiology 
and environmental epigenopfcs^fiifl^ing that a physiological health-enhancing stimulus can 
induce highly consistM rnqj^incations in DNA methylation that are associated to gene 



change^^^^^dHt with observed phenotypic adaptations. 



expression 
MATERIALS AND METHODS 



[<||ll||y^n|Hiinogical measurements 

The s^dy, labeled EpiTrain ("Epigenetics in Training"), was approved by the Ethics Committee 
of Karolinska Institutet and conformed to the Declaration of Helsinki. Twenty-three young, 
sedentary volunteers (Table 1) trained only one randomized leg during three months, and the 
other leg was used as an untrained intraindividual control leg. Two one-legged knee-extension 
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performance tests were conducted before and after the training period. Skeletal muscle biopsies 
from vastus lateralis were taken before and 24h after the last training session from both legs. The 
post-training performance tests were conducted 3-6 days after the biopsies. Enzyme activity 
assays included citrate synthase (CS) and P-HAD activity. For details, see Supplementary 
Methods S9. 

DNA methylation methods 

The total amount of DNA methylation (at CCGG sites) in the genome, b^ore anc^after training 
in both legs, was analyzed with LUMA (LUminometric Methylati<(^!Ijil|^?J?^ Genome-wide 
DNA methylation profiling was generated with the Infinium Hu^^^J^hylation450K BeadChip 



ilementar 



t^Lbei^ 



array on bisulfite-treated DNA from biopsies coUect^lLbefJ^ (Tl) and after training (T2, 
trained leg), for 17 subjects in total. Bisulfl^j^jt-cpeafencing was adopted for technical 
validation. We analyzed global hydroxymethyl^^^ wth a colorimetric antibody-based method 



l^^^wtl 

n^ufacti 



from Epigentek according to the n^i^facturef s specifications. More details are given in 
Supplementary Methods S9. 

Bioinformatics analysis of IK^^^iilBylation data 



We employed a pre-prdS^sinh and normalization pipeline as reported previously.^'. Color-bias 
adjustment, quaitj^f^JkalKation, probe type bias adjustment and batch correction were 
performe^f^r'^jjllier details see Supplementary Methods S9 and Marabita et al.,^' where the 
cu^^^lfcd^d^was named "Dataset B". Differentially methylated positions (DMPs) were 
defini^ using limma on M values,^^ including the group (T2 vs. Tl) and the subject as 
covariates, in order to account for the paired design. DMPs were selected if FDR<0.05. For 
sample clustering, principal component analysis was performed using only data from the DMPs. 
Standard lUumina annotation was used to annotate methylation data. Additional sources included 
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NIH Roadmap Epigenomic (http://www.roadmapepigenomics.org) data for skeletal muscle and 
ChromHMM chromatin segmentation tracks for HSMM cell line in Encode 
(http://encodeproject.org). For each annotation category the relative fraction of positions located 
within each feature type was calculated for DMPs, non-DMPs and the entire array.^GREAT 
(http://great.stanford.edu) was used to discover functional categories associated with DMPs.^^e 
enrichment for known transcriptional motifs was tested ig^k^kpPMER 



(http://homer.saIk.edu/homer/) and motifs were clustered^ usiq^ STAMP 
(http://www.benoslab.pitt.edu/stamp/index.php). Full details are Supplementary 
Methods S9. 
RNA sequencing 

Total RNA was used to prepare libraries, whicl^keil iPqlfeiced as paired-end, 2xl00bp on an 
lUumina HIseq2000 and generated an averag^:£2»mmion paired-end reads per sample. 
Briefly, we performed quality contro3|ni™ii^5 mapping, PGR duplicate removal and gene 
count summarization. EdgeR was used to normalize the data and extract differentially 



used for sampl e ^li «t(my . ^ne ontology analysis was done taking length bias into account, as 



expressed genes (DEGs). \\f^ms^||j|^ the group (T2 vs. Tl), the library preparation and the 
individual as covariates. DEG were selected if FDR<0.05. Multidimensional scaling plots were 

implemented, iq^g^q,^^ and enriched KEGG pathways were visualized with pathview.^" Full 
^^HP^y^^^ii*^ ^ Supplementary Methods S9. The baseline muscle transcriptome has been 
descr-^ed elsewhere,^^ where a subset of the samples before training (n=12) has been analyzed. 
Integration between DNA methylation and transcriptomics 

A list of genes and the corresponding measured DNA methylation positions was obtained and 
correlation was calculated for each pair. The distribution of Spearman rho statistics between 
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DNA methylation and gene expression was calculated either including all pairs of 
genes/methylation positions or only pairs formed by a DMP and a DEG. Transcriptional 
networks were reconstructed from gene expression data applying the Mutual Information (MI) 

CO 

method developed in ARACNE. The resulting network components were visualized and 
analyzed with Cytoscape. For each independent network domain, gene ontology was tested using 
the BiNGO tool (http://www.psb.ugent.be/cbd/papers/BiNGO/Home.htnil).^1|^S|^tional 
information see Supplementary Methods S9. 
Data availability 

Data are available on GEO under the accession numbers ^^^^^1^55 (DNA methylation), 
GSE58608 and GSE60590 (gene expression). 
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Figure 1. Effects of endurance exercise training are mirrored by alterations in DNA 
methylation and gene expression. 

The significant (* = p<10^) effect of training in the trained leg is shown by the increa^ in a 15 
min performance test (A) and the citrate synthase activity in muscle biopsies (B) in T2 (a^r 
training) vs. Tl (before training). For comparison, the untrained leg is also^l^^^^lliere a 



smaller change is observed in performance and no change in CS activity. # indic^es significant 
differences between the changes in the two legs. Additional phys^^f^^J^^^asurements are 
shown in Figure SIB-C. The physiological changes are mirmred ^ modifications in DNA 
methylation (C) and gene expression (D). For DNA metl^atiOT^the effect size is measured as 
the difference in M-values and points in black mrrespflncLto DMPs with FDR<0.05. For gene 
expression, the log2(Fold Change) is plotted|^|j^\tffl^average log2(Counts Per Million) and red 
points correspond to genes with FDR^^S. Col»elation between changes in DNA methylation 
and CS activity exists and results wo\jF in S5. The clustering of the samples is shown in E 
using either DNA methylatjiji ^fcn|f panels) or gene expression (lower panels). A segment 
connects to measureme^ fr^j the same subject obtained before or after training. Samples are 
alternatively color ^|^^^ r3%) (Tl=blue, T2=red) or by gender (M=green, F=magenta). For 
DNA meto j^ tij^n^rincipal Component Analysis was employed using only autosomal DMPs, 
wl||ii^^^^g|^j^xpression the top 1000 genes with largest biological variation were chosen and 
the biological coefficient of variation used to produce a multidimensional scaling plot. 
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Figure 2. DNA methylation changes are primarily localized in enhancers. 

(A-C) For each annotation category, the relative fraction of positions located within each feature 
type is calculated for DMPs (red bars), non-DMPs (blue bars) and the entire position on the array 
(green bar). Panels A-B were obtained using lUumina annotation, while for panel C^e used 
lUumina annotation (Enhancers) or publicly available ChlP-seq experiment on HSMM cells from 
which we derived H3K4mel and H3K27ac peaks. The presence of H3K27ac^jl^^^^^^tes an 
active regulatory element, while H3K4mel+H3K27ac marks active enhancers a^ other distal 
elements. (D) The log2 fold enrichment for DMPs vs. the array wajj^^j^f^!^ for the relative 
fraction of probes falling in each category; data from chrorqatin segmentation algorithms in 
HSMM cells were used. For additional informatioiM^see^upplementary Methods S9 . 
Significance codes: * p<0.05; *** p<0.001; Fisl^m^s ©x^^tst^ 
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Figure 3, Muscle related processes and factors are enriched in the up-methylated DMPs, 

GREAT analysis was performed to retrieved flinctional categories associated with DMPs 
increasing or decreasing methylation after training. Up to top five categories passing the 
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threshold (see Supplementary Methods S9) are shown for GO Molecular Function, Biological 
Process and Cellular Component. We tested the presence of known enriched motif on a 
S5niimetrical 200bp window around each DMP (p<10"^'^, consensus motif shown). Known 
profiles were clustered and a familial logo was drawn. For a corresponding ontology analysis of 
gene expression, see Figures S2-S3. 
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Figure 4. Identification of regions exhibiting positive or negative correlations between 
methylation and expression, 

(A) Distribution of Spearman correlation between DNA methylation and gene expression 
calculated either including aU pairs of genes/methylation positions (grey line) or only pairs 
formed by a DMP and a DEG (black line). The two distributions are strongly different (p< 2.2e- 
16 Kolmogorov-Smirnov Test) and peaks of positive and negative correlation are highlighted 
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after selecting changing sites after training. (B) A Q-Q plot was obtained by plotting the 
observed p-values from the above correlation analysis against those obtained under a uniform 
distribution. (C) The distribution of the Spearman correlation was stratified according to the 
genomic location of the DMP with respect to the linked gene. (D) A starburst plot illusjb-ates the 
relationship between DNA methylation and expression changes, with a color density 
representation of the scatterplot of the pairs of genes/methylation positions. T^^g^^^llll axis 
is the FDR for each gene multiplied by the sign of the fold change, while Ihe vert^al axis id the 
FDR for methylation multiplied by the sign of the fold change. GrqiJ^^^f^^^espond to pairs 
where FDR<0.05. Numbers on each quadrant show the percentage o^ositions (black) or genes 
(red) that fall on each region. Examples of expressioilJietH^i^tion pairs are given for the 
indicated probes and genes. Points represenl^^rnp^^^d are colored as shown on the 
corresponding legends (training: Tl=black,^S«jeAga|ider: M=blue, F=red). The correlation for 
the cases taken from the significant ^i|drants » visually explained by the training, while the 
correlation obtained for non-signific^^^a^ may be explained because of the individual levels 
of methylation and gene exOT^s]ltb|pig correlated at the subject level. 
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Fig 5. Transcriptional network analysis reveals coordinated alternations in methylation 
and gene expression, 

A transcriptional network was reconstructed using RNA expression data, showing three major 
domains (see Supplementary Methods S9 and Figure S7). The whole network is shown inside the 
rectangle at the bottom of the figure, where each node is colored according to the corresponding 
fold change. The three major domains (A,B,C) are zoomed and only genes with corresponding 
significant changes in DNA methylation are labeled. The color indicates whether the gene has 
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corresponding DMPs that significantly increase (red) or decrease (blue) their methylation levels. 
A dark grey color indicates that the gene has significant but discordant changes in DNA 
methylation. 



Table 1. Baseline subject characteristics for males (n=12), females (n=ll) and^J^bj^s 
(n=23). Data is presented as mean ± SEM. For details on the training program s^'^j|^itaj^ . 





Age 

(yrs) 


Height 
(cm) 


BMI ^ 

(kgW) 


Pe^VOi 
7ml*kg-^*min"^) 


Males 


27.5 ± 0.97 


181 ±2 


24.8^^^ 


39.5 ± 1.1 


Females 


26.4 ± 1.31 


169 ±2 


1^^0.9 


38.4 ± 1.2 


All subjects 


27.0 ± 0.79 


175 ± 2 ^^/^ 


^.0±0.8 


39.0 ±0.8 
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